Nowadays, a major challenge in machine learning is the `Big Data' challenge. In big data problems, due to the large number of data points, the large number of features per data point, or both, the training of models has become very slow. Training time has two major components: the time to access the data and the time to process it. In this paper, we propose one possible solution for handling big data problems in machine learning. The focus is on reducing training time by reducing data access time, through proposed systematic sampling and cyclic/sequential sampling techniques for selecting mini-batches from the dataset. To demonstrate the effectiveness of the proposed sampling techniques, we use Empirical Risk Minimization, a commonly studied machine learning problem, in the strongly convex and smooth setting. The problem is solved using SAG, SAGA, SVRG, SAAG-II and MBSGD (mini-batched SGD), each with two step-size determination techniques, namely a constant step size and the backtracking line search method. Theoretical results prove that systematic sampling and cyclic sampling achieve, in expectation, the same convergence as the widely used random sampling technique. Experimental results on benchmark datasets demonstrate the efficacy of the proposed sampling techniques.
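As a rough illustration of the two mini-batch selection schemes named above, the following sketch shows one plausible formulation (the function names, and the choice of sampling interval `k = n // m`, are our own assumptions, not taken from the paper): systematic sampling draws a single random start and then takes every k-th point, so a mini-batch occupies regularly spaced, largely sequential memory locations, while cyclic/sequential sampling takes consecutive slices of the dataset in order.

```python
import random

def systematic_minibatch(n, m):
    """Systematic sampling (assumed formulation): pick one random start
    in the first interval of length k = n // m, then take every k-th
    point. Only one random draw per batch, and the resulting accesses
    are regularly strided rather than fully random."""
    k = n // m                   # sampling interval
    start = random.randrange(k)  # single random draw
    return [start + j * k for j in range(m)]

def cyclic_minibatch(n, m, t):
    """Cyclic/sequential sampling (assumed formulation): batch t is the
    t-th consecutive slice of size m, wrapping around after n // m
    batches, so data is read strictly in storage order."""
    num_batches = n // m
    first = (t % num_batches) * m
    return list(range(first, first + m))

# Example: n = 100 points, mini-batches of size m = 10.
print(cyclic_minibatch(100, 10, 0))   # first sequential slice
print(systematic_minibatch(100, 10))  # strided batch, random offset
```

Both schemes touch the data in (mostly) storage order, which is the mechanism by which the paper argues data access time is reduced relative to uniform random sampling.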